We show that log-periodic power-law (LPPL) functions are intrinsically very hard to fit to time
series. This comes from their sloppiness, the squared residuals depending very much on some 
combinations of parameters and very little on other ones. The time of singularity that is supposed 
to give an estimate of the day of the crash belongs to the latter category. We discuss in detail why 
and how the fitting procedure must take into account the sloppy nature of this kind of model. We 
then test the reliability of LPPLs on synthetic AR(1) data replicating the Hang Seng 1987 crash 
and show that even this case is borderline regarding predictability of divergence time. We finally 
argue that current methods used to estimate a probabilistic time window for the divergence time 
are likely to be over-optimistic. 



INTRODUCTION 



Log-periodic functions have received much attention because of the claim that they could be
used to predict the times of singularities. While they are known to occur in hierarchical
discrete scale-free networks [36], they have been claimed to have been observed in many types of
natural time/size series: earthquakes […], ice-quakes [13], forest fires [27], as well as
evolutionary trees [30], although such claims have not gone unchallenged [18]. But the most
noticed application of such functions is to speculative bubbles of stock indices […] [51],
foreign exchange rates [23], real estate [31] and commodity prices […], as well as downward
spirals during the burst of the bubble [55]. Given the importance of such phenomena, and the
possibly important consequences of finding a universal model that could be applied to this
remarkable variety of bubbles, it is of course necessary to assess the statistical significance
of LPPLs regarding crashes, i.e., the predictive power of log-periodic functions in this
context. The question is still unsettled […] [35]. Problems are indeed numerous: what
definition of a bubble and of a crash should one adopt […] [26] […]? Should the price in a
bubble always be increasing [6]? Should one impose constraints on the fitted parameters [55]?
Where should the fit of a bubble start [2]? What test of goodness of fit should one use [3]?
Why does the length of the data window greatly affect the parameters of the best fit of the
LPPL to the data […]? Why can leaving out a few data points alter the parameters of the best
fit sufficiently to change a bubble/no-bubble decision (see e.g. […], footnote 4)? Why is the
fitting error very sensitive to small (but not large!) changes in one of the parameters of the
model [7]?

What contributes to most, if not all, of these difficulties is that a stable best fit of an
LPPL to the data is very hard to determine. Here we aim to show that this comes from the fact
that LPPLs belong to the family of sloppy functions, a terminology introduced in a series of
papers by Sethna et al. [5] [15] [17] [40]; we will discuss in detail what this means when
applying LPPLs to noisy time series.



SLOPPINESS 

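Let us denote by $p(t)$ the time series to be fitted, by $f$ the fitting function and by $\Phi$
the set of parameters. Least-squares fits minimise

$$S = \frac{1}{t_1 - t_0 - n} \sum_{t=t_0}^{t_1} \left[f(t) - p(t)\right]^2,$$

where $t_1 < t_c$, the time when the singularity (crash) occurs, and $n$ is the number of free
parameters. The best fit $\hat\Phi$ to some given data corresponds by definition to the minimum
of $S$; therefore, close to $\hat\Phi$,

$$S \simeq S(\hat\Phi) + \frac{1}{2} \sum_{x,y \in \Phi} \frac{\partial^2 S}{\partial x\,\partial y}\,(x - \hat{x})(y - \hat{y}).$$

Assuming that $\hat\Phi$ does not sit on a boundary of the parameter space, the curvature of $S$
in a neighborhood of $\hat\Phi$ is positive; as a consequence, the Hessian, i.e. the matrix of
second derivatives $\partial^2 S / \partial x\,\partial y$ evaluated at $\hat\Phi$, is
positive-definite, thus all its eigenvalues are positive.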

Table I. Eigenvalues, $\lambda$, and associated eigenvectors of the best fit of real price for
the 1987 crash. Components of absolute value larger than 0.1 are in bold face.

Table II. Eigenvalues, $\lambda$, and associated eigenvectors of the best fit of log price for
the 1987 crash. Components of absolute value larger than 0.1 are in bold face.

As shown recently in a series of papers (e.g. [5] [15] [17] [40]), sloppy models are
characterized by a separation of Hessian eigenvalues by orders of magnitude, which is all the
more likely and evident when the models have many parameters. Combinations of parameters
corresponding to large eigenvalues are called stiff, while those corresponding to small
eigenvalues are called sloppy. In other words, slightly varying a stiff parameter combination
has a large influence on $S$, while changing sloppy combinations of parameters does not
substantially modify $S$. This has two consequences, discussed in detail in the following
sections: first, fitting sloppy functions must be done carefully; second, out-of-sample
predictions from the best-fit values of sloppy parameters may be imprecise: as the noise changes
from sample to sample, the fitted values of sloppy parameters are likely to change greatly.
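
The stiff/sloppy distinction can be illustrated on a model small enough to work out by hand. The following Python sketch is only an illustration under stated assumptions: it uses a two-parameter sum of exponentials as a toy model (not the LPPL itself), a hypothetical sampling grid and hypothetical parameter values, and it replaces the exact Hessian by the Gauss-Newton matrix $J^T J$, which is proportional to the Hessian of $S$ when the residuals vanish at the best fit.

```python
import math

def hessian_sum_of_exponentials(a, b, times):
    """Entries of J^T J for the toy model f(t) = exp(-a t) + exp(-b t).

    For a zero-residual best fit, J^T J is proportional to the Hessian of S.
    """
    haa = hab = hbb = 0.0
    for t in times:
        da = -t * math.exp(-a * t)  # df/da
        db = -t * math.exp(-b * t)  # df/db
        haa += da * da
        hab += da * db
        hbb += db * db
    return haa, hab, hbb

def eigen_2x2(haa, hab, hbb):
    """Eigenvalues (stiff, sloppy) and stiff eigenvector of [[haa, hab], [hab, hbb]]."""
    mean = 0.5 * (haa + hbb)
    radius = math.hypot(0.5 * (haa - hbb), hab)
    stiff, sloppy = mean + radius, mean - radius
    # (hab, stiff - haa) solves (H - stiff * I) v = 0; normalise it.
    vx, vy = hab, stiff - haa
    norm = math.hypot(vx, vy)
    return stiff, sloppy, (vx / norm, vy / norm)

times = [0.05 * k for k in range(1, 101)]  # hypothetical sampling grid on (0, 5]
stiff, sloppy, v_stiff = eigen_2x2(*hessian_sum_of_exponentials(1.0, 1.2, times))
print(f"stiff eigenvalue  : {stiff:.3e}")
print(f"sloppy eigenvalue : {sloppy:.3e}")
print(f"ratio             : {stiff / sloppy:.1e}")
print(f"stiff direction   : ({v_stiff[0]:.2f}, {v_stiff[1]:.2f})")
```

In this toy case the stiff direction mixes both decay rates with the same sign, so the data constrain a weighted sum of the two rates much better than their difference. This is the same kind of behaviour as the one discussed here for combinations of LPPL parameters.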

Let us apply this reasoning to the fitting of financial index prices $p(t)$ with log-periodic
functions, as used originally in this context in [34],

$$f_{LP}(t) = A + B\,(t_c - t)^{\alpha}\left[1 + C \cos\left(\omega \log(t_c - t) + \phi\right)\right].$$

This is a seven-parameter fit, but as already noted in the original paper, minimizing $S$ with
respect to $A$, $B$, and $C$ yields linear equations, which reduces the non-linear part of the
fitting problem to four parameters. However, sloppiness concerns a priori all seven parameters.
This is why we shall keep them all, i.e. $\Phi = \{A, B, t_c, \alpha, C, \omega, \phi\}$, in
order to give a fuller account of sloppiness. Once we understand the respective importance of
$A$, $B$, and $C$ in $S$, we will be able to focus on the other parameters.
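
The reduction from seven to four non-linear parameters can be sketched as follows. This is a minimal Python illustration, not the procedure of the original paper: the helper names `lppl`, `solve_linear` and `fit_linear_part` and all numerical values are hypothetical. It relies on the observation that, for fixed $t_c$, $\alpha$, $\omega$ and $\phi$, the model is linear in $(A, B, BC)$, so these three coefficients follow from ordinary least squares.

```python
import math

def lppl(t, A, B, tc, alpha, C, omega, phi):
    """f_LP(t) = A + B (tc - t)^alpha [1 + C cos(omega log(tc - t) + phi)], for t < tc."""
    dt = tc - t
    return A + B * dt ** alpha * (1.0 + C * math.cos(omega * math.log(dt) + phi))

def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_linear_part(times, prices, tc, alpha, omega, phi):
    """Best (A, B, C) for fixed non-linear parameters, via the normal equations.

    Regress prices on the basis 1, g(t) and g(t) cos(...), with g(t) = (tc - t)^alpha;
    the three coefficients are A, B and B*C, from which C is recovered.
    """
    basis = []
    for t in times:
        g = (tc - t) ** alpha
        basis.append((1.0, g, g * math.cos(omega * math.log(tc - t) + phi)))
    gram = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    rhs = [sum(b[i] * p for b, p in zip(basis, prices)) for i in range(3)]
    A, B, BC = solve_linear(gram, rhs)
    return A, B, BC / B

# Hypothetical parameter values, chosen only to exercise the code.
true = dict(A=4000.0, B=-100.0, tc=900.0, alpha=0.5, C=0.1, omega=7.0, phi=1.0)
times = range(1, 835)
prices = [lppl(t, **true) for t in times]
A, B, C = fit_linear_part(times, prices, true["tc"], true["alpha"], true["omega"], true["phi"])
print(A, B, C)
```

On noiseless data generated with the correct non-linear parameters, the linear step recovers $A$, $B$ and $C$ up to rounding errors. In an actual fit this step would be nested inside the search over the four remaining parameters.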

It turns out that log-periodic functions are very sloppy: every crash we fitted resulted in a
clear separation of eigenvalues by orders of magnitude. Let us take for example the 1987 crash
in the Hang Seng index. We obtained the best fit by applying the Levenberg-Marquardt algorithm
to the 834 days preceding the crash and retaining the best result from a set of 20000 initial
conditions. The eigenvalues and eigenvectors of the Hessian at this best fit are shown in
Table I for real prices and in Table II for log prices.

These tables contain several relevant pieces of information. First, the largest eigenvalue is
at least 9 orders of magnitude larger than the smallest one, a definite signature of sloppiness;
in addition, the eigenvalues are well spread over these orders of magnitude. The associated
eigenvectors confirm the wisdom that the stiffer a direction, the more likely it is to lie close
to an axis, and conversely for sloppy directions [17]. Next, the eigenvectors vary from crash to
crash and can be quite different between real and log prices: for the 1987 crash, the linear-fit
parameters ($A$, $B$, and $C$) are completely disconnected from the other ones only in the case
of real prices; curiously, this is not systematic, as both log and real prices of the 1997 crash
lead to disconnected eigenvectors, for instance. When the eigenvectors associated with $A$, $B$,
and $C$ are not completely disconnected from the other four parameters, one should not fit them
separately when estimating the error associated with $t_c$, for instance (see section Discussion
below and […]).

We are of course chiefly interested in the role of $t_c$: this crucial parameter turns out to be
one of the most sloppy parameters and, as a consequence, its associated eigenvector does not lie
along the $t_c$ axis but also comprises the phase $\phi$ and the frequency $\omega$. In order to
fit $t_c$ precisely, one should therefore take this inter-dependency into account properly,
which is not the case in the state-of-the-art papers on the topic, which all rely on the
Levenberg-Marquardt algorithm (see below for remedies). We also note that $t_c$ seems slightly
less sloppy for real prices than for log prices for the 1987 crash.

Figure 1. Eigenvalues associated with the four parameters requiring non-linear fitting as a
function of time in days immediately before $t_c$ for the 1987, 2000, and 2007 crashes (log
prices).



Sloppiness is intrinsic to the LPPL equation: it is specific neither to the 1987 crash nor to
the dangerous times just before a crash. To illustrate this important point, Figure 1 plots the
four eigenvalues associated with the parameters requiring a non-linear fit as a function of time
in the 150 days preceding the 1987, 2000, and 2007 crashes on the Hang Seng, chosen randomly;
the eigenvalues are well spaced and their typical spacing stays very large over the whole time
series; their structure is also constant, with no crossing of eigenvalues. We have found the
same behaviour for all the crashes investigated. Note, however, that some crashes lead to more
sloppy fits than others, i.e., with an even larger eigenvalue separation.



It should be noted that, in principle, some sloppy models can be made non-sloppy by a suitable
change of fitting functions. For instance, fitting a function in [0, 1] with a sum of
exponentials is known to be ill-posed [35]. However, using Hermite polynomials lifts the
sloppiness of exponentials [40]. Unfortunately, this approach relies on a symmetry assumption
between the parameters that does not hold for LPPL.



Sloppiness has important consequences and, despite its negative connotation, these are not only
negative. In any case, being aware that LPPLs are sloppy models helps one understand several
important aspects of making predictions with an LPPL, in particular with respect to the
uncertainty associated with the most sloppy parameters; this will be discussed in the next few
sections.



CONSEQUENCES OF SLOPPINESS 
Sensitivity of tc 

The main result of the previous section is not only that LPPL functions are sloppy, but also
that varying $t_c$ together with $\phi$ has little influence on the squared residuals.
Conversely, slightly changing the input data can change the fitted $t_c$ tremendously. This
explains first why the diagnostic of a bubble is sometimes sensitive to the addition or deletion
of a single data point. By extension, the sensitivity of $t_c$ to noise must be investigated,
and one must understand how reliable the fits of LPPL to noisy data can be.

Quite tellingly, early papers using LPPL to predict various kinds of crashes used only a single
fit, which, of course, is problematic in the light of sloppiness. Recent papers try to build a
probabilistic window for $t_c$ […]. The problem one faces is to estimate a probability
distribution for $t_c$ from a single noisy time series. The method consists essentially in
varying the beginning and end of the time series, thereby obtaining a distribution of fitted
values for $t_c$. But this only happens because LPPLs are sloppy and because $t_c$ is one of the
least relevant variables in the fit. Thus, this new method uses the intrinsic imprecision of
LPPL regarding $t_c$. This is the positive side of sloppiness. The negative side is of course
that the imprecision on $t_c$ is a priori very large. In addition, there is no real guarantee
that the distribution of $t_c$ thus obtained corresponds to anything meaningful. As we shall
explain below, special methods have been devised for sloppy functions that are able to give
reliable probability distributions for fitted variables from a single time series.



Fitting LPPL 

First, using simple fitting algorithms is bound to be problematic for sloppy functions (see e.g.
the discussion in [5]), as most of them approximate the variation of the cost function to first
order when trying to find the next move in parameter space. In the case of sloppy functions,
however, one needs to take into account not only the gradient, but also the curvature of the
cost landscape by computing the eigenvectors and following them, which is computationally more
costly. A computational compromise is the Levenberg-Marquardt method (used by people studying
LPPL ever since the original paper), which approximates the Hessian with a product of gradients,
thus implicitly assuming that the eigenvectors do not deviate much from the axes. While this is
a reasonable approximation as regards some eigenvalues, as seen in Tables I and II, it breaks
down in particular for $t_c$: this means that reaching a correct estimate of $t_c$ requires more
sophisticated methods, such as the Rosenbrock method [31] or the trust region algorithm [9], at
the cost of computational time. In this paper, we restrict our attention to the performance and
pitfalls of Levenberg-Marquardt; applying such methods is beyond its scope.
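
The role played by the product of gradients in Levenberg-Marquardt can be made concrete with a short Python sketch. It is not the implementation used for the fits reported here: the two-parameter model `a * exp(-b t)` is a hypothetical stand-in for the non-linear part of an LPPL fit, and the damping rule (Marquardt's diagonal scaling with a factor-of-ten update) is one common variant among several.

```python
import math

def model(t, a, b):
    """Toy model a * exp(-b t), standing in for the non-linear part of an LPPL fit."""
    return a * math.exp(-b * t)

def cost_and_normal_equations(params, times, data):
    """Return S = sum of squared residuals, the vector J^T r and the entries of J^T J."""
    a, b = params
    S = g0 = g1 = h00 = h01 = h11 = 0.0
    for t, y in zip(times, data):
        e = math.exp(-b * t)
        r = a * e - y              # residual
        j0, j1 = e, -a * t * e     # d(residual)/da, d(residual)/db
        S += r * r
        g0 += j0 * r
        g1 += j1 * r
        h00 += j0 * j0
        h01 += j0 * j1
        h11 += j1 * j1
    return S, (g0, g1), (h00, h01, h11)

def levenberg_marquardt(params, times, data, mu=1e-3, n_iter=100):
    """Minimise S by solving (J^T J + mu diag(J^T J)) delta = -J^T r at each step."""
    S, g, h = cost_and_normal_equations(params, times, data)
    for _ in range(n_iter):
        h00, h01, h11 = h[0] * (1.0 + mu), h[1], h[2] * (1.0 + mu)
        det = h00 * h11 - h01 * h01
        delta = ((-g[0] * h11 + g[1] * h01) / det, (g[0] * h01 - g[1] * h00) / det)
        trial = (params[0] + delta[0], params[1] + delta[1])
        S_trial, g_trial, h_trial = cost_and_normal_equations(trial, times, data)
        if S_trial < S:   # accept the step and rely more on the curvature estimate
            params, S, g, h = trial, S_trial, g_trial, h_trial
            mu *= 0.1
        else:             # reject the step and fall back towards short gradient steps
            mu *= 10.0
    return params, S

times = [0.25 * k for k in range(21)]
data = [model(t, 2.0, 0.5) for t in times]   # noiseless data, hypothetical true values
start = (1.5, 0.8)
S_start = cost_and_normal_equations(start, times, data)[0]
fitted, S_final = levenberg_marquardt(start, times, data)
print(S_start, S_final, fitted)
```

The matrix `J^T J` never involves second derivatives of the model, which is what makes the method cheap. Whether that approximation is adequate along the sloppiest directions is exactly the question raised in this section.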

Fitting full log-periodic functions with AR(1) noise 

Among recent advances, the residuals were shown to be AR(1) [16] [26]. It makes sense,
therefore, to create artificial data with AR(1) noise. Let us consider the very simple case
where one adds some noise to a pure log-periodic function and applies a fitting procedure. More
specifically, we fit $f_{LP}(t) + \sigma \eta(t)$, where $\eta$ follows the auto-regressive
process $\eta(t) = \eta(t-1)(1 - \lambda) + \epsilon(t)$, where $\epsilon \sim \mathcal{N}(0, 1)$,
$\lambda$ is the memory loss and $\sigma$ tunes the strength of the fluctuations. AR(1) noise
that mimics the fit of LPPL functions to the 1987 crash is obtained with $\lambda = 0.06$ and
$\sigma = 25$. A natural test of the predictive power of the fit to $f_{LP}$ is to consider a
time series that starts at $t = 1$ and keeps expanding until $t_c$. We created 1000 such samples
and computed averages of the fitted parameters for increasing time series length. The average
estimates of the parameters, quite remarkably including $t_c$, do converge to the true values at
about 60 time steps (2.7 trading months) before the crash itself (Figure 2). Thus it turns out
that fitting an LPPL to synthetic data generated by an LPPL with a level of noise comparable to
that of real markets is possible and that the average estimate of $t_c$ behaves very well ahead
of $t_c$. Therefore, one concludes that Levenberg-Marquardt works well for estimating average
parameter values for synthetic data with many samples. A natural crash warning is then obtained
when the average of $t_c$ stabilises.
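
The construction of such synthetic samples can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used for the study: the helper names are hypothetical, the LPPL parameter values and the value of `sigma` are placeholders chosen for the example, and the noise is started from zero before the first time step.

```python
import math
import random

def ar1_noise(innovations, lam):
    """eta(t) = (1 - lam) * eta(t - 1) + eps(t), with eta = 0 before the first step."""
    eta, previous = [], 0.0
    for eps in innovations:
        previous = (1.0 - lam) * previous + eps
        eta.append(previous)
    return eta

def lppl(t, A, B, tc, alpha, C, omega, phi):
    """f_LP(t) = A + B (tc - t)^alpha [1 + C cos(omega log(tc - t) + phi)], for t < tc."""
    dt = tc - t
    return A + B * dt ** alpha * (1.0 + C * math.cos(omega * math.log(dt) + phi))

def synthetic_sample(times, lppl_params, lam, sigma, rng):
    """One noisy realisation f_LP(t) + sigma * eta(t) with Gaussian innovations."""
    eta = ar1_noise([rng.gauss(0.0, 1.0) for _ in times], lam)
    return [lppl(t, **lppl_params) + sigma * e for t, e in zip(times, eta)]

rng = random.Random(0)  # fixed seed so that the example is reproducible
# Hypothetical LPPL parameters; sigma = 25.0 is an assumed value for this example.
params = dict(A=4000.0, B=-100.0, tc=900.0, alpha=0.5, C=0.1, omega=7.0, phi=1.0)
n_samples = 20          # kept small here; the study described above uses 1000
samples = [synthetic_sample(range(1, 835), params, lam=0.06, sigma=25.0, rng=rng)
           for _ in range(n_samples)]
print(len(samples), len(samples[0]))
```

Each sample would then be truncated at increasing end times and refitted, and the fitted parameters averaged over samples for each window length.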

However, when given a single run, predicting $t_c$ is much more difficult: the standard
deviation of $t_c$ is about a half of $t_c - t$. Hence, since the residuals are Gaussian
distributed (we have checked that this is the case), the 80% confidence window, as chosen in
recent papers on predictions with LPPL [3] […], corresponds to a width of about
$\tfrac{2}{3}(t_c - t)$, hence ranges from $t_c - (t_c - t)/3$ to $t_c + (t_c - t)/3$, while the
95% confidence window ranges from $t$ to $t_c + (t_c - t)$. So when a crash warning is issued,
the crash can occur any day at 95% confidence. Hence, predicting the date of a divergence is
hard, even when the underlying time series is a real LPPL. The 1987 crash was chosen because
LPPL fits it better than other ones. Hence, one expects worse results for the parameters
associated with other crashes.



DISCUSSION 

Given the attention devoted to LPPL and despite recent technical developments, it is important
to realise how sloppy this kind of function is. The sloppiness of LPPLs implies that special
care must be taken when estimating the uncertainty on $t_c$. The leap of faith of LPPL regarding
bubbles is not the log-periodic nature of the oscillations, but the attempt to fit data with
functions that contain a divergence. Thus the discussion on $t_c$ is largely disconnected from
the nature of the oscillations, as it is only related to a way to describe super-exponential
growth. Obviously one can fit real data with a function that does not contain oscillations by
setting $C = 0$, thus focusing on the super-exponential growth. We tried it on real data and,
while the precision on $t_c$ is slightly worse than that obtained with an LPPL, it is a simpler
method of obtaining an estimate for $t_c$.

The fit of synthetic data with AR(1) noise is most revealing for several reasons. It first shows
that the Levenberg-Marquardt algorithm is adequate for noisy synthetic data, that is, when the
underlying function is of LPPL type. Next, the uncertainty associated with $t_c$ in a realistic
but favourable case is quite large and is at the frontier of being exploitable. This strongly
suggests that making predictions with real data is likely to yield worse uncertainties, since
there is a priori no reason for the oscillations of real data to be systematically LPPL-based.
Recent work that tries to estimate a probabilistic 80% confidence time window for $t_c$ is
certainly a step in the right direction. But since the time windows usually proposed are more
optimistic than the reference case considered here, it is very likely that the method used
underestimates the uncertainty on $t_c$. This is but an example of the problem of estimating
parameter uncertainty from a single realisation of noise. As explained above, $t_c$ may
fluctuate very much when the input time series is changed slightly because it is a sloppy
parameter; hence, the mere fact that it does fluctuate is not an indication per se that the
variance of the fluctuations correctly approximates its real uncertainty. Obtaining trustworthy
predictions for sloppy parameters from a single time series is possible by Bayesian estimation
[5] [15]. Further work will explore this direction.

Figure 2. Average and standard deviation of the crash time estimate $t_c$ for synthetic data
with AR(1) noise and parameters reproducing the 1987 Hang Seng crash.